Hands-on Exercise 3: Processing and Visualizing Flow Data

Author

Kristine Joy Paas

Published

December 1, 2023

Modified

December 2, 2023

Overview

In this exercise, we will process and visualize flow data.

I am running through most of the code quickly here, as the wrangling steps are very similar to those in Take-home Exercise 1. I will not analyze every step, since I have already learned them.

Getting Started

We will load the packages used for this exercise.

pacman::p_load(tmap, sf, DT, stplanr,
               performance,
               ggpubr, tidyverse)

Preparing flow data

Importing the data

odbus <- read_csv("data/aspatial/origin_destination_bus_202310.csv")
odbus$ORIGIN_PT_CODE <- as.factor(odbus$ORIGIN_PT_CODE)
odbus$DESTINATION_PT_CODE <- as.factor(odbus$DESTINATION_PT_CODE) 
head(odbus)
# A tibble: 6 × 7
  YEAR_MONTH DAY_TYPE   TIME_PER_HOUR PT_TYPE ORIGIN_PT_CODE DESTINATION_PT_CODE
  <chr>      <chr>              <dbl> <chr>   <fct>          <fct>              
1 2023-10    WEEKENDS/…            16 BUS     04168          10051              
2 2023-10    WEEKDAY               16 BUS     04168          10051              
3 2023-10    WEEKENDS/…            14 BUS     80119          90079              
4 2023-10    WEEKDAY               14 BUS     80119          90079              
5 2023-10    WEEKDAY               17 BUS     44069          17229              
6 2023-10    WEEKENDS/…            17 BUS     20281          20141              
# ℹ 1 more variable: TOTAL_TRIPS <dbl>

Extracting the study data

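# Keep weekday trips recorded under hours 6 to 9 (inclusive),
# then total the trips for each origin-destination bus stop pair
odbus6_9 <- odbus %>%
  filter(DAY_TYPE == "WEEKDAY") %>%
  filter(TIME_PER_HOUR >= 6 &
           TIME_PER_HOUR <= 9) %>%
  group_by(ORIGIN_PT_CODE,
           DESTINATION_PT_CODE) %>%
  summarise(TRIPS = sum(TOTAL_TRIPS))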
datatable(odbus6_9)
write_rds(odbus6_9, "data/rds/odbus6_9.rds")
odbus6_9 <- read_rds("data/rds/odbus6_9.rds")
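
As a quick check of the extraction logic above, here is a minimal sketch on a made-up table. The toy values and the OTHER day-type label are assumptions for illustration, not values from the real dataset. Note that the hour filter is inclusive at both ends, so trips recorded under hour 9 are kept.

```r
library(dplyr)

# Hypothetical toy records that mirror the relevant columns of odbus
toy_odbus <- tibble(
  DAY_TYPE            = c("WEEKDAY", "WEEKDAY", "WEEKDAY", "OTHER", "WEEKDAY"),
  TIME_PER_HOUR       = c(6, 9, 10, 7, 7),
  ORIGIN_PT_CODE      = c("A", "A", "A", "A", "B"),
  DESTINATION_PT_CODE = c("B", "B", "B", "B", "A"),
  TOTAL_TRIPS         = c(10, 5, 100, 50, 3)
)

# Same filter and aggregation as the real chunk, applied to the toy table
toy_peak <- toy_odbus %>%
  filter(DAY_TYPE == "WEEKDAY",
         TIME_PER_HOUR >= 6,
         TIME_PER_HOUR <= 9) %>%
  group_by(ORIGIN_PT_CODE, DESTINATION_PT_CODE) %>%
  summarise(TRIPS = sum(TOTAL_TRIPS), .groups = "drop")

toy_peak
```

Only the weekday rows for hours 6, 7 and 9 survive the filter; the hour 10 row and the non-weekday row are dropped before the trips are summed.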

Working with Geospatial Data

Importing geospatial data

busstop <- st_read(dsn = "data/geospatial",
                   layer = "BusStop") %>%
  st_transform(crs = 3414)
Reading layer `BusStop' from data source 
  `/Users/kjcpaas/Documents/Grad School/ISSS624/Project/ISSS624/Hands-on_Ex3/data/geospatial' 
  using driver `ESRI Shapefile'
Simple feature collection with 5161 features and 3 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 3970.122 ymin: 26482.1 xmax: 48284.56 ymax: 52983.82
Projected CRS: SVY21
mpsz <- st_read(dsn = "data/geospatial",
                layer = "MPSZ-2019") %>%
  st_transform(crs = 3414)
Reading layer `MPSZ-2019' from data source 
  `/Users/kjcpaas/Documents/Grad School/ISSS624/Project/ISSS624/Hands-on_Ex3/data/geospatial' 
  using driver `ESRI Shapefile'
Simple feature collection with 332 features and 6 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 103.6057 ymin: 1.158699 xmax: 104.0885 ymax: 1.470775
Geodetic CRS:  WGS 84
write_rds(mpsz, "data/rds/mpsz.rds")

Geospatial data wrangling

Combining busstop and mpsz

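# Overlay bus stops on subzones, keep only the stop and subzone codes,
# and drop the geometry so the result is a plain lookup table
busstop_mpsz <- st_intersection(busstop, mpsz) %>%
  select(BUS_STOP_N, SUBZONE_C) %>%
  st_drop_geometry()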
datatable(busstop_mpsz)
write_rds(busstop_mpsz, "data/rds/busstop_mpsz.rds")  
od_data <- left_join(odbus6_9 , busstop_mpsz,
            by = c("ORIGIN_PT_CODE" = "BUS_STOP_N")) %>%
  rename(ORIGIN_BS = ORIGIN_PT_CODE,
         ORIGIN_SZ = SUBZONE_C,
         DESTIN_BS = DESTINATION_PT_CODE)
# Check for rows that are exact duplicates after the join
duplicate <- od_data %>%
  group_by_all() %>%
  filter(n()>1) %>%
  ungroup()
# Keep one copy of each duplicated row
od_data <- unique(od_data)
od_data <- left_join(od_data , busstop_mpsz,
            by = c("DESTIN_BS" = "BUS_STOP_N")) 
# Repeat the duplicate check after joining the destination subzones
duplicate <- od_data %>%
  group_by_all() %>%
  filter(n()>1) %>%
  ungroup()
od_data <- unique(od_data)
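# Drop rows with missing values (e.g. stops that matched no subzone),
# then total the trips for each origin-destination subzone pair
od_data <- od_data %>%
  rename(DESTIN_SZ = SUBZONE_C) %>%
  drop_na() %>%
  group_by(ORIGIN_SZ, DESTIN_SZ) %>%
  summarise(MORNING_PEAK = sum(TRIPS))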
write_rds(od_data, "data/rds/od_data.rds")
od_data <- read_rds("data/rds/od_data.rds")
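
The duplicate checks in the chunk above matter because a lookup table with repeated rows multiplies the flows it is joined to. Here is a minimal sketch of that effect, using a made-up lookup in which one stop code is listed twice. The toy tables are assumptions for illustration only; I have not confirmed here which stops are actually repeated in the real data.

```r
library(dplyr)

# Hypothetical flow table: a single origin-destination pair with 20 trips
toy_flows <- tibble(ORIGIN_BS = "A", DESTIN_BS = "B", TRIPS = 20)

# Hypothetical lookup in which stop "A" appears twice with the same subzone
toy_lookup <- tibble(
  BUS_STOP_N = c("A", "A", "B"),
  SUBZONE_C  = c("SZ1", "SZ1", "SZ2")
)

# The join returns one row per matching lookup row, so the flow is repeated
joined <- left_join(toy_flows, toy_lookup,
                    by = c("ORIGIN_BS" = "BUS_STOP_N"))
nrow(joined)

# unique() collapses exact duplicates back to a single row
deduped <- unique(joined)
nrow(deduped)
```

Without the unique() step, the 20 trips would be summed twice in the later aggregation. Note that unique() only removes rows that are identical in every column, so a stop matched to two different subzones would still appear twice.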

Visualizing Spatial Interaction

Removing intra-zonal flows

od_data1 <- od_data[od_data$ORIGIN_SZ!=od_data$DESTIN_SZ,]
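
Intra-zonal flows start and end in the same subzone, so they cannot be drawn as a line between two different zones. A minimal sketch of the filter on a made-up table (the subzone codes and trip counts are hypothetical):

```r
# Hypothetical subzone-level flows, two of which stay inside one subzone
toy_od <- data.frame(
  ORIGIN_SZ    = c("SZ1", "SZ1", "SZ2"),
  DESTIN_SZ    = c("SZ1", "SZ2", "SZ2"),
  MORNING_PEAK = c(100, 40, 7),
  stringsAsFactors = FALSE
)

# Keep only rows whose origin and destination subzones differ
toy_inter <- toy_od[toy_od$ORIGIN_SZ != toy_od$DESTIN_SZ, ]
toy_inter
```

Only the SZ1 to SZ2 row remains after the filter.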

Creating desire lines

In the code chunk below, od2line() from the stplanr package is used to create the desire lines.

flowLine <- od2line(flow = od_data1, 
                    zones = mpsz,
                    zone_code = "SUBZONE_C")
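
To build some intuition for what a desire line is, here is a minimal sketch that draws one by hand with sf only. It assumes that a desire line is a straight line between the centroids of the origin and destination zones, which is my understanding of what od2line() produces for polygon zones. The two square zones and the single flow are made up for illustration, and this is not the internal code of od2line().

```r
library(sf)

# Helper (hypothetical): a unit square whose lower-left corner is at (x0, 0)
sq <- function(x0) {
  st_polygon(list(rbind(c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1),
                        c(x0, 1), c(x0, 0))))
}

# Two made-up zones with subzone codes, no CRS attached
toy_zones <- st_sf(SUBZONE_C = c("SZ1", "SZ2"),
                   geometry = st_sfc(sq(0), sq(2)))

# One made-up flow between the two zones
toy_flow <- data.frame(ORIGIN_SZ = "SZ1", DESTIN_SZ = "SZ2",
                       MORNING_PEAK = 40, stringsAsFactors = FALSE)

# Centroid coordinates as a matrix, with rows named by zone code
cent <- st_coordinates(st_centroid(st_geometry(toy_zones)))
rownames(cent) <- toy_zones$SUBZONE_C

# One straight line per flow, from origin centroid to destination centroid
line_list <- lapply(seq_len(nrow(toy_flow)), function(i) {
  st_linestring(unname(rbind(cent[toy_flow$ORIGIN_SZ[i], ],
                             cent[toy_flow$DESTIN_SZ[i], ])))
})
toy_lines <- st_sf(toy_flow, geometry = st_sfc(line_list))

# Length of the line, in the (unitless) coordinates of the toy zones
st_length(toy_lines)
```

The flow attributes stay attached to the line, which is what lets MORNING_PEAK drive the line width when the desire lines are mapped.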

Visualizing the desire lines

To visualise the resulting desire lines, the code chunk below is used.

tm_shape(mpsz) +
  tm_polygons() +
flowLine %>%  
tm_shape() +
  tm_lines(lwd = "MORNING_PEAK",
           style = "quantile",
           scale = c(0.1, 1, 3, 5, 7, 10),
           n = 6,
           alpha = 0.5,
           col = "red")

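To focus on the heavier flows, the code chunk below keeps only desire lines with at least 5,000 morning peak trips.

tm_shape(mpsz) +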
  tm_polygons() +
flowLine %>%  
  filter(MORNING_PEAK >= 5000) %>%
tm_shape() +
  tm_lines(lwd = "MORNING_PEAK",
           style = "quantile",
           scale = c(0.1, 1, 3, 5, 7, 10),
           n = 6,
           alpha = 0.5,
           col = "red")

Reflections

  • While doing the exercise, I was interested in analyzing the flow from origin to destination to figure out people’s movement patterns. However, in Take-home Exercise 1 we did not know where people were going. I had hypotheses, for example that people come from the same place (their home) in the mornings on both weekdays and weekends, but not in the evenings, when they come from school or the office. I could not verify these with LISA and EHSA, because they analyze only one variable (such as TRIPS).

  • I am too tired because of Take-home Exercise 1, so I didn’t have much time or energy to work on this. Fortunately, this exercise is shorter and I already understand the data wrangling part, so I can focus more on the flow analysis.